Power Analysis of a General Convolution Algorithm Mapped on a Linear Processor Array
نویسندگان
چکیده
We explore the energy dissipation of the Linear Processor Array (LPA) as a function of the number of available resources (Processor Units P) within the array. This number P is an important parameter, as it reflects performance, relates parallel processing to energy dissipation, and influences the scaling of the various parts of the LPA architecture (memory, address generator, communication network). To make a comparison of the different design variants for a fixed datawidth possible, we propose a high-level energy dissipation model of the processor, which is based on a detailed analysis of a general convolution algorithm. It is shown that the energy dissipation of the LPA can roughly be described by the relationship Etotal ∼ N/P with N presenting the datawidth in pixels. This relationship is derived from two observations: first, the largest contribution to Etotal is formed by the energy dissipated by the memories, and second, in our model of the LPA, the datawidth of the memories corresponds with the number of pixels N to be processed, which results in an increase of the access rate when P decreases. Furthermore, we have shown that the energy dissipation caused by communication within the LPA, increases with increasing number of resources: the trade-off between communication versus computation in parallel computing. This turns out to be negligible in the total energy dissipation, and we therefore conclude, that the optimum solution is found, when a full number of resources is applied within the LPA.
منابع مشابه
Design and Implementation of Field Programmable Gate Array Based Baseband Processor for Passive Radio Frequency Identification Tag (TECHNICAL NOTE)
In this paper, an Ultra High Frequency (UHF) base band processor for a passive tag is presented. It proposes a Radio Frequency Identification (RFID) tag digital base band architecture which is compatible with the EPC C C2/ISO18000-6B protocol. Several design approaches such as clock gating technique, clock strobe design and clock management are used. In order to reduce the area Decimal Matrix C...
متن کاملSystolic algorithms for the CMU warp processor
CMU is building a 32-bit floating-point systolic array that can cfficicndy perform many essential computations in signal processing like the FFT and convolution. This is a one-dimensional systolic array that in general takes inputs from one end cell and produces outputs at the other end, with data and control all flowing in one direction. We call this particular systolic array the Warp processo...
متن کاملCMU - CS - 84 - 158 Systolic Algorithms for the CMU Warp Processor
CMU is building a 32-bit floating-point systolic array that can efficiently perform many essential computations in signal processing like the FFT and convolution. This is a one-dimensional systolic array that in general takes inputs from one end cell and produces outputs at the other end, with data and control all flowing in one direction. We call this particular systolic array the Warp process...
متن کاملA Discrete Singular Convolution Method for the Seepage Analysis in Porous Media with Irregular Geometry
A novel discrete singular convolution (DSC) formulation is presented for the seepage analysis in irregular geometric porous media. The DSC is a new promising numerical approach which has been recently applied to solve several engineering problems. For a medium with regular geometry, realizing of the DSC for the seepage analysis is straight forward. But DSC implementation for a medium with ir...
متن کاملA parallel Viterbi decoder for block cyclic and convolution codes
We present a parallel version of Viterbi’s decoding procedure, for which we are able to demonstrate that the resultant task graph has restricted complexity in that the number of communications to or from any processor cannot exceed 4 for BCH codes. The resulting algorithm works in lock step making it suitable for implementation on a systolic processor array, which we have implemented on a field...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- VLSI Signal Processing
دوره 37 شماره
صفحات -
تاریخ انتشار 2004